Abstract

For millimeter-wave (mmWave) massive MIMO systems, codebook-based analog beamforming
(including transmit precoding and receive combining) is usually used to compensate for the severe
attenuation of mmWave signals. However, conventional beamforming schemes involve a complicated search
among pre-defined codebooks to find the optimal pair of analog precoder and analog combiner. To solve
this problem, by combining the idea of the turbo equalizer with the tabu search (TS) algorithm, we
propose a Turbo-like beamforming scheme based on TS, called Turbo-TS beamforming in
this paper, to achieve near-optimal performance with low complexity. Specifically, the proposed
Turbo-TS beamforming scheme is composed of the following two key components: 1) Based on the
iterative information exchange between the base station and the user, we design a Turbo-like joint search
scheme to find the near-optimal pair of analog precoder and analog combiner; 2) Inspired by the
idea of the TS algorithm developed in artificial intelligence, we propose a TS-based precoding/combining
scheme to intelligently search for the best precoder/combiner in each iteration of the Turbo-like joint
search with low complexity. Analysis shows that the proposed Turbo-TS beamforming can considerably reduce
the searching complexity, and simulation results verify that it can achieve near-optimal performance.
I. Introduction

The integration of millimeter-wave (mmWave) and massive multiple-input multiple-output (MIMO)
is regarded as a promising technique for future 5G wireless communication systems, since it can
provide orders of magnitude increase in both the available bandwidth and the spectral efficiency. On
one hand, the very short wavelength associated with mmWave enables a large antenna array to be easily
installed in a small physical dimension. On the other hand, the large antenna array in massive MIMO
can provide a sufficient antenna gain to compensate for the severe attenuation of mmWave signals due to
path loss, oxygen absorption, and rainfall effect, as the beamforming (including transmit precoding
and receive combining) technique can concentrate the signal in a narrow beam.

MmWave massive MIMO systems usually perform beamforming in the analog domain, where the
transmitted signals or received signals are only controlled by the analog phase shifter (PS) network
with low hardware cost. Compared with traditional digital beamforming, analog beamforming can
decrease the required number of expensive radio frequency (RF) chains at both the base station (BS) and
users, which is crucial to reduce the energy consumption and hardware complexity of mmWave massive
MIMO systems. Existing dominant analog beamforming schemes can be generally divided into two
categories, i.e., non-codebook beamforming and codebook-based beamforming. For
non-codebook beamforming, there are already some excellent schemes. For example, a low-complexity analog
beamforming scheme, where two PSs are employed for each entry of the beamforming matrix, has been proposed
to achieve the optimal performance of fully digital beamforming. However, these methods require perfect
channel state information (CSI) to be acquired by the BS, which is very challenging in practice, especially
when the number of RF chains is limited. By contrast, codebook-based beamforming can obtain
the optimal pair of analog precoder and analog combiner by searching the pre-defined codebook without
knowing the exact channel. The most intuitive and optimal scheme is full search (FS) beamforming [8].
However, its complexity increases exponentially with the number of RF chains and the number of
quantization bits of the angles of arrival and departure (AoAs/AoDs). To reduce the searching complexity
of codebook-based beamforming, some low-complexity schemes, such as the ones adopted by the standards
IEEE 802.15.3c [9] and IEEE 802.11ad [10], have already been proposed. Furthermore, a multi-level codebook
together with a ping-pong searching scheme is also proposed in [11]. These schemes can reduce the searching
complexity without obvious performance loss. However, they usually involve a large number of iterations to
exchange information between the user and the BS, leading to a high overhead for practical systems.

To reduce both the searching complexity and the overhead of codebook-based beamforming, in this
paper, we propose a Turbo-like beamforming scheme based on the tabu search algorithm [12] (called Turbo-TS
beamforming) with near-optimal¹ performance for mmWave massive MIMO systems. Specifically,
the proposed Turbo-TS beamforming scheme is composed of the following two key components: 1)
Based on the iterative information exchange between the BS and the user, we design a Turbo-like joint
search scheme to find the near-optimal pair of analog precoder and analog combiner; 2) Inspired by
the TS algorithm in artificial intelligence, we develop a TS-based precoding/combining scheme to
intelligently search for the best precoder/combiner in each iteration of the Turbo-like joint search with
low complexity. Furthermore, the contributions of the proposed TS-based precoding/combining can be
summarized in the following three aspects: 1) We provide appropriate definitions of the neighborhood,
cost, and stopping criterion involved in TS-based precoding/combining; 2) We take the exact solution
instead of the conventional “move” as tabu to guarantee a wider searching range; 3) We propose a restart
method that selects several different initial solutions uniformly distributed in the codebooks to further
improve the performance. It is shown that the proposed Turbo-TS beamforming can considerably reduce the
searching complexity. We verify through simulations that Turbo-TS beamforming can approach the
performance of FS beamforming [8].

The rest of this paper is organized as follows. Section II briefly introduces the system model of
mmWave massive MIMO. Section III specifies the proposed Turbo-TS beamforming. The simulation
results of achievable rate are shown in Section IV. Finally, conclusions are drawn in Section V.

Notation: Lower-case and upper-case boldface letters denote vectors and matrices, respectively;
$(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$, and $\det(\cdot)$ denote the transpose, conjugate transpose,
inversion, and determinant of a matrix, respectively; $E(\cdot)$ denotes the expectation; finally,
$\mathbf{I}_N$ is the $N \times N$ identity matrix.

II. System Model 

We consider the mmWave massive MIMO system with beamforming as shown in Fig. 1, where the
BS employs $N_t$ antennas and $N_t^{\mathrm{RF}}$ RF chains to simultaneously transmit $N_s$ data streams
to a user with $N_r$ antennas and $N_r^{\mathrm{RF}}$ RF chains. To fully achieve the spatial multiplexing
gain, we usually have $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s$. The $N_s$ independent transmitted data
streams in the baseband first pass through $N_t^{\mathrm{RF}}$ RF chains to be converted into analog
signals. After that, the output signals are precoded by an $N_t \times N_t^{\mathrm{RF}}$ analog precoder
$\mathbf{P}_A$ as $\mathbf{x} = \mathbf{P}_A \mathbf{s}$ before transmission, where $\mathbf{s}$ is the
$N_s \times 1$ transmitted signal vector subject to the normalized power
$E(\mathbf{s}\mathbf{s}^H) = \frac{1}{N_s}\mathbf{I}_{N_s}$. Note that the analog precoder $\mathbf{P}_A$ is
usually realized by a PS network with low hardware complexity, which requires that all elements of
$\mathbf{P}_A$ satisfy $|p_{i,j}|^2 = \frac{1}{N_t}$. Under the narrowband block-fading massive MIMO
channel [13], the $N_r \times 1$ received signal vector $\mathbf{r}$ at the user can be presented as

$\mathbf{r} = \sqrt{\rho}\,\mathbf{H}\mathbf{P}_A\mathbf{s} + \mathbf{n}$, (1)

where $\rho$ is the transmitted power, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ denotes the channel
matrix, which will be discussed in detail later in this section, and
$\mathbf{n} = [n_1, \cdots, n_{N_r}]^T$ is the additive white Gaussian noise (AWGN) vector, whose entries
are independent and identically distributed (i.i.d.), i.e.,
$\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I}_{N_r})$.

¹Note that “near-optimal” means achieving performance close to that of the optimal FS beamforming.

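At the user side, an $N_r \times N_r^{\mathrm{RF}}$ analog combiner $\mathbf{C}_A$ is employed to process
the received signal vector $\mathbf{r}$ as

$\mathbf{y} = \mathbf{C}_A^H \mathbf{r} = \sqrt{\rho}\,\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A\mathbf{s} + \mathbf{C}_A^H\mathbf{n}$, (2)

where the elements of $\mathbf{C}_A$ have constraints similar to those of $\mathbf{P}_A$, i.e.,
$|c_{i,j}|^2 = \frac{1}{N_r}$.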

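Due to the limited number of significant scatterers and the serious antenna correlation of mmWave
communication [13], in this paper we adopt the widely used geometric Saleh-Valenzuela channel model [13],
where the channel matrix $\mathbf{H}$ can be presented as

$\mathbf{H} = \sqrt{\frac{N_t N_r}{L}} \sum_{l=1}^{L} \alpha_l\, \mathbf{f}_r(\phi_l^r)\, \mathbf{f}_t^H(\phi_l^t)$, (3)

where $L$ is the number of significant scatterers, and we usually have $L \le \min(N_t, N_r)$ for mmWave
communication systems due to the sparse nature of scatterers, $\alpha_l \in \mathbb{C}$ is the gain of the
$l$th path including the path loss, and $\phi_l^t$ and $\phi_l^r$ are the azimuth AoD and AoA of the $l$th
path, respectively. Finally, $\mathbf{f}_t(\phi_l^t)$ and $\mathbf{f}_r(\phi_l^r)$ are the antenna array
response vectors, which depend on the antenna array structure at the BS and the user. When the widely used
uniform linear arrays (ULAs) are considered, we have

$\mathbf{f}_t(\phi_l^t) = \frac{1}{\sqrt{N_t}} \left[1, e^{jkd\sin(\phi_l^t)}, \cdots, e^{j(N_t-1)kd\sin(\phi_l^t)}\right]^T$, (4)

$\mathbf{f}_r(\phi_l^r) = \frac{1}{\sqrt{N_r}} \left[1, e^{jkd\sin(\phi_l^r)}, \cdots, e^{j(N_r-1)kd\sin(\phi_l^r)}\right]^T$, (5)

where $k = \frac{2\pi}{\lambda}$, $\lambda$ denotes the wavelength of the signal, and $d$ is the antenna spacing.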
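The channel model in (3)-(5) can be illustrated with a minimal NumPy sketch. It assumes half-wavelength spacing, path gains drawn from $\mathcal{CN}(0,1)$, angles uniform in $[0, \pi]$ (the settings quoted later in Section IV), and the common $\sqrt{N_t N_r / L}$ normalization; the function names `ula_response` and `sv_channel` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def ula_response(n_ant, phi, d_over_lambda=0.5):
    """ULA array response at azimuth phi (radians), following (4)-(5)."""
    kd = 2.0 * np.pi * d_over_lambda            # k*d with k = 2*pi/lambda
    return np.exp(1j * np.arange(n_ant) * kd * np.sin(phi)) / np.sqrt(n_ant)

def sv_channel(n_t, n_r, n_paths, rng):
    """Draw one narrowband geometric channel H (n_r x n_t), following (3)."""
    # Assumed statistics: CN(0, 1) path gains, uniform AoDs/AoAs in [0, pi].
    alpha = (rng.standard_normal(n_paths)
             + 1j * rng.standard_normal(n_paths)) / np.sqrt(2.0)
    aod = rng.uniform(0.0, np.pi, n_paths)
    aoa = rng.uniform(0.0, np.pi, n_paths)
    h = np.zeros((n_r, n_t), dtype=complex)
    for a, phi_t, phi_r in zip(alpha, aod, aoa):
        # Rank-one contribution alpha_l * f_r(phi_r) * f_t(phi_t)^H of one path.
        h += a * np.outer(ula_response(n_r, phi_r),
                          ula_response(n_t, phi_t).conj())
    return np.sqrt(n_t * n_r / n_paths) * h

rng = np.random.default_rng(0)
H = sv_channel(n_t=64, n_r=16, n_paths=3, rng=rng)
```

Because $\mathbf{H}$ is a sum of $L$ rank-one terms, its rank is at most $L$, which is the sparsity that beamsteering codebooks exploit.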


III. Near-Optimal Turbo-TS Beamforming With Low Complexity 

In this section, we first give a brief introduction of the codebook-based beamforming, which is 
widely used in mmWave massive MIMO systems. After that, a low-complexity near-optimal Turbo- 
TS beamforming scheme is proposed, which consists of Turbo-like joint search scheme and TS-based 
precoding/combining. Finally, the complexity analysis is provided to show the advantage of the proposed 
Turbo-TS beamforming scheme. 
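A. Codebook-based beamforming

According to the special characteristics of the mmWave channel, the beamsteering codebook [8] is widely
used. Specifically, let $\mathcal{F}$ and $\mathcal{W}$ denote the beamsteering codebooks for the analog
precoder and analog combiner, respectively. If we use $B_t^{\mathrm{RF}}$ ($B_r^{\mathrm{RF}}$) bits to
quantize the AoD (AoA), $\mathcal{F}$ ($\mathcal{W}$) will consist of all the possible analog precoder
(combiner) matrices $\mathbf{P}_A$ ($\mathbf{C}_A$), which can be presented as

$\mathbf{P}_A = \left[\mathbf{f}_t(\bar{\phi}_1^t), \mathbf{f}_t(\bar{\phi}_2^t), \cdots, \mathbf{f}_t(\bar{\phi}_{N_t^{\mathrm{RF}}}^t)\right]$, (6)

$\mathbf{C}_A = \left[\mathbf{f}_r(\bar{\phi}_1^r), \mathbf{f}_r(\bar{\phi}_2^r), \cdots, \mathbf{f}_r(\bar{\phi}_{N_r^{\mathrm{RF}}}^r)\right]$, (7)

where the quantized AoD $\bar{\phi}_i^t$ for $i = 1, \cdots, N_t^{\mathrm{RF}}$ at the BS has
$2^{B_t^{\mathrm{RF}}}$ possible candidates, i.e., $\bar{\phi}_i^t = \frac{2\pi n}{2^{B_t^{\mathrm{RF}}}}$,
where $n \in \{1, \cdots, 2^{B_t^{\mathrm{RF}}}\}$. Similarly, the quantized AoA $\bar{\phi}_j^r$ for
$j = 1, \cdots, N_r^{\mathrm{RF}}$ at the user has $2^{B_r^{\mathrm{RF}}}$ possible candidates, i.e.,
$\bar{\phi}_j^r = \frac{2\pi n}{2^{B_r^{\mathrm{RF}}}}$, where $n \in \{1, \cdots, 2^{B_r^{\mathrm{RF}}}\}$.
Thus, the cardinalities $|\mathcal{F}|$ of $\mathcal{F}$ and $|\mathcal{W}|$ of $\mathcal{W}$ are
$2^{B_t^{\mathrm{RF}} N_t^{\mathrm{RF}}}$ and $2^{B_r^{\mathrm{RF}} N_r^{\mathrm{RF}}}$, respectively. Then,
by jointly searching $\mathcal{F}$ and $\mathcal{W}$, the optimal pair of analog precoder and analog
combiner can be selected by maximizing the achievable rate as [13]

$R = \max_{\mathbf{P}_A \in \mathcal{F},\, \mathbf{C}_A \in \mathcal{W}} \log_2 \det\left(\mathbf{I}_{N_s} + \frac{\rho}{N_s}\mathbf{R}_n^{-1}\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A\mathbf{P}_A^H\mathbf{H}^H\mathbf{C}_A\right) = \max_{\mathbf{P}_A \in \mathcal{F},\, \mathbf{C}_A \in \mathcal{W}} \log_2\left(\varphi(\mathbf{P}_A, \mathbf{C}_A)\right)$, (8)

where $\mathbf{R}_n = \sigma^2\mathbf{C}_A^H\mathbf{C}_A$ presents the covariance matrix of the noise after
combining, and

$\varphi(\mathbf{P}_A, \mathbf{C}_A) = \det\left(\mathbf{I}_{N_s} + \frac{\rho}{N_s}\mathbf{R}_n^{-1}\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A\mathbf{P}_A^H\mathbf{H}^H\mathbf{C}_A\right)$ (9)

is defined as the cost function. We can observe that to obtain the optimal pair of analog precoder
and analog combiner, we need to exhaustively search the codebooks $\mathcal{F}$ and $\mathcal{W}$. When
$N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = 2$ and $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, the total number
of required searches is $1.6 \times 10^7$, which is almost impossible in practice.
In this paper, we propose Turbo-TS beamforming to reduce the searching complexity. The proposed
Turbo-TS beamforming is composed of two key components, i.e., the Turbo-like joint search scheme and
TS-based precoding/combining, which will be described in detail in Section III-B and
Section III-C, respectively.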
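As an illustration of the codebook entries in (6)-(7) and the cost function in (9), the following NumPy sketch builds one precoder/combiner pair from quantized angles and evaluates its cost. It assumes half-wavelength ULAs, uses a random Gaussian matrix as a placeholder channel rather than the geometric model, and picks column indices whose steering vectors are linearly independent so that the noise covariance is invertible; all function names are hypothetical.

```python
import numpy as np

def steering(n_ant, phi):
    """ULA steering vector with d = lambda/2 (assumed spacing)."""
    return np.exp(1j * np.pi * np.arange(n_ant) * np.sin(phi)) / np.sqrt(n_ant)

def build_matrix(n_ant, indices, bits):
    """Stack steering vectors for 1-based angle indices n, angle = 2*pi*n / 2**bits."""
    return np.stack(
        [steering(n_ant, 2.0 * np.pi * q / 2 ** bits) for q in indices], axis=1)

def cost(h, p_a, c_a, rho, sigma2, n_s):
    """Determinant in (9); the achievable rate in (8) is log2 of this value."""
    r_n = sigma2 * c_a.conj().T @ c_a                 # noise covariance after combining
    h_eff = c_a.conj().T @ h @ p_a                    # effective channel C_A^H H P_A
    inner = np.linalg.solve(r_n, h_eff @ h_eff.conj().T)
    return np.linalg.det(np.eye(c_a.shape[1]) + (rho / n_s) * inner).real

rng = np.random.default_rng(1)
# Placeholder channel for illustration only (not the geometric model of Section II).
H = rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))
P = build_matrix(64, (2, 7), bits=3)   # precoder with column indices {2, 7}
C = build_matrix(16, (1, 4), bits=3)   # combiner with column indices {1, 4}
rate = np.log2(cost(H, P, C, rho=1.0, sigma2=1.0, n_s=2))
```

A full search would evaluate `cost` for every admissible pair of index tuples, which is what makes its complexity grow so quickly with the number of quantization bits.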


B. Turbo-like joint search scheme 

Based on the idea of the information interaction in the well-known turbo equalizer, we propose a
Turbo-like joint search scheme to find the near-optimal pair of analog precoder and analog combiner, which
is shown in Fig. 2. Let $\mathbf{P}_A^{\mathrm{opt},k}$ and $\mathbf{C}_A^{\mathrm{opt},k}$ denote the
near-optimal analog precoder and analog combiner obtained in the $k$th iteration, respectively, where
$k = 1, 2, \cdots, K$, and $K$ is the pre-defined maximum number of iterations. Firstly, the BS selects an
initial precoder $\mathbf{P}_A^{\mathrm{opt},0}$, which can be an arbitrary candidate in $\mathcal{F}$, to
transmit a training sequence to the user. Then the user can search for the best analog combiner
$\mathbf{C}_A^{\mathrm{opt},1}$. After that, the user uses $\mathbf{C}_A^{\mathrm{opt},1}$ to transmit a
training sequence to the BS, and in return the BS can search for the best analog precoder
$\mathbf{P}_A^{\mathrm{opt},1}$. We repeat such iterations $K$ times in a similar way to the turbo
equalizer, and output $\mathbf{P}_A^{\mathrm{opt},K}$ and $\mathbf{C}_A^{\mathrm{opt},K}$ as the final pair
of analog precoder and analog combiner, which is expected to achieve near-optimal performance, as will be
verified later in Section IV. Note that in each iteration, searching for the best analog precoder
(combiner) after a potential analog combiner (precoder) has been selected from the codebook $\mathcal{W}$
($\mathcal{F}$) can be realized by the proposed TS-based analog precoding/combining with low complexity,
which will be described in detail in the next subsection.
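The alternating structure of the Turbo-like joint search can be sketched as follows. This is only a toy illustration: the cost of every precoder/combiner pair is assumed to be available in a precomputed table, and an exhaustive one-sided `argmax` stands in for the TS-based search of Section III-C; the function name and the table are hypothetical.

```python
import numpy as np

def turbo_joint_search(cost_table, p_init, n_iter):
    """Alternate combiner and precoder searches for n_iter rounds.

    cost_table[p, c] is the cost of precoder candidate p paired with combiner
    candidate c (a toy stand-in for the cost measured through training).
    """
    p = p_init
    history = []
    for _ in range(n_iter):
        c = int(np.argmax(cost_table[p, :]))   # user: best combiner for current precoder
        p = int(np.argmax(cost_table[:, c]))   # BS: best precoder for that combiner
        history.append(float(cost_table[p, c]))
    return p, c, history

rng = np.random.default_rng(0)
table = rng.random((240, 240))                 # toy costs, 240 candidates per side
p_best, c_best, hist = turbo_joint_search(table, p_init=0, n_iter=4)
```

With exact one-sided searches the recorded cost can never decrease from one round to the next, which is why a small number of rounds $K$ is often sufficient; with a heuristic inner search this monotonicity is no longer guaranteed.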

C. TS-based precoding/combining 

In this subsection, we first focus on the process of searching for the best analog precoder
$\mathbf{P}_A$ after a potential analog combiner $\mathbf{C}_A$ has been selected. The process of searching
for the best analog combiner $\mathbf{C}_A$ after a certain analog precoder $\mathbf{P}_A$ has been
selected can be derived in a similar way.

The basic idea of the proposed TS-based analog precoding can be described as follows. TS-based analog
precoding starts from an initial solution, i.e., an analog precoder matrix selected from the codebook
$\mathcal{F}$, and defines a neighborhood around it (several analog precoder matrices from $\mathcal{F}$
based on a neighboring criterion). After that, it selects the most appropriate solution in the
neighborhood as the starting point for the next iteration, even if it is not the global optimum. During the
search in the neighborhood, TS attempts to escape from local optima by utilizing the concept of “tabu”,
whose definition can be changed according to different criteria (e.g., convergence speed, complexity,
etc.). This process continues until a certain stopping criterion is satisfied, and finally the best
solution among all iterations is declared as the final solution. Next, five important aspects of the
proposed TS-based precoding, including neighborhood definition, cost computation, tabu, stopping criterion,
and TS algorithm, are explained in detail.

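1) Neighborhood definition: Note that the $m$th column of the analog precoder $\mathbf{P}_A$ can be
represented by an index $q_m \in \{1, 2, \cdots, 2^{B_t^{\mathrm{RF}}}\}$, which corresponds to the vector
$\mathbf{f}_t\left(2\pi q_m / 2^{B_t^{\mathrm{RF}}}\right)$ as defined in (4) and (6). Then
an analog precoder is defined as a neighbor of $\mathbf{P}_A$ if: i) it has only one column that is
different from the corresponding column in $\mathbf{P}_A$; ii) the index difference between the two
corresponding columns equals one. For example, when $N_t^{\mathrm{RF}} = 2$ and $B_t^{\mathrm{RF}} = 3$,
for an analog precoder $\mathbf{P}_A$ with column indices $(q_1, q_2)$, the precoder with column indices
$(q_1 - 1, q_2)$ is a neighbor of $\mathbf{P}_A$.

Let $\mathbf{P}_A^{(i)}$ denote the starting point in the $i$th iteration of the proposed TS-based analog
precoding, and $\mathcal{V}(\mathbf{P}_A^{(i)}) = \{\mathbf{V}_1^{(i)}, \mathbf{V}_2^{(i)}, \cdots, \mathbf{V}_{|\mathcal{V}|}^{(i)}\}$
presents the neighborhood of $\mathbf{P}_A^{(i)}$, where $|\mathcal{V}|$ is the cardinality of
$\mathcal{V}$. According to the neighborhood definition above, it is obvious that
$|\mathcal{V}| = 2N_t^{\mathrm{RF}}$. We then define that the $u$th neighbor $\mathbf{V}_u^{(i)}$ in
$\mathcal{V}$ is different from $\mathbf{P}_A^{(i)}$ in the $\lceil u/2 \rceil$th column, and the index of
the corresponding column is $q_{\lceil u/2 \rceil}^{(i)} + (-1)^{\mathrm{mod}(u,2)}$, where
$q_{\lceil u/2 \rceil}^{(i)}$ is the index of this column in $\mathbf{P}_A^{(i)}$. To avoid overflow of the
definition above, we set

$1 + (-1)^{\mathrm{mod}(u,2)} = \max\left(1 + (-1)^{\mathrm{mod}(u,2)},\, 1\right)$, (10)

$2^{B_t^{\mathrm{RF}}} + (-1)^{\mathrm{mod}(u,2)} = \min\left(2^{B_t^{\mathrm{RF}}} + (-1)^{\mathrm{mod}(u,2)},\, 2^{B_t^{\mathrm{RF}}}\right)$. (11)

For example, when $N_t^{\mathrm{RF}} = 2$, the neighborhood of an analog precoder $\mathbf{P}_A^{(i)}$ with
column indices $(q_1, q_2)$ consists of the four precoders $\mathbf{V}_1^{(i)}$, $\mathbf{V}_2^{(i)}$,
$\mathbf{V}_3^{(i)}$, and $\mathbf{V}_4^{(i)}$ with column indices $(q_1 - 1, q_2)$, $(q_1 + 1, q_2)$,
$(q_1, q_2 - 1)$, and $(q_1, q_2 + 1)$, respectively.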
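The neighborhood rule, including the clamping in (10)-(11), can be sketched in a few lines of Python. A precoder is represented here only by its tuple of 1-based column indices; the function name `neighbors` and this tuple representation are assumptions made for illustration.

```python
def neighbors(q, bits):
    """Return the 2*len(q) neighbors of the column-index tuple q (1-based indices)."""
    top = 2 ** bits
    out = []
    for u in range(1, 2 * len(q) + 1):
        col = (u + 1) // 2 - 1                 # the ceil(u/2)-th column, 0-based
        step = (-1) ** (u % 2)                 # -1 for odd u, +1 for even u
        new_index = min(max(q[col] + step, 1), top)   # clamp to [1, 2**bits]
        out.append(q[:col] + (new_index,) + q[col + 1:])
    return out

print(neighbors((2, 7), bits=3))
```

Note that at the boundary of the index range the clamped neighbor coincides with the current solution itself, so a practical implementation may want to skip such entries.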

2) Cost computation: We define the value of the cost function $\varphi(\mathbf{P}_A, \mathbf{C}_A)$ in (9)
as the reliability metric of a possible solution, i.e., a solution $\mathbf{P}_A$ leading to a larger value
of $\varphi(\mathbf{P}_A, \mathbf{C}_A)$ is a better solution. Further, according to the neighborhood
definition, we can observe that once we obtain the cost of $\mathbf{P}_A$, we do not need further
information exchange between the BS and the user to recompute (9) for its neighborhood. This is due to the
fact that the neighbor $\mathbf{V}_u$ of $\mathbf{P}_A$ differs from $\mathbf{P}_A$ only in the
$\lceil u/2 \rceil$th column, so the updated effective channel matrix
$\mathbf{C}_A^H\mathbf{H}\mathbf{V}_u$ in (9) also differs from the original effective channel matrix
$\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A$ only in the $\lceil u/2 \rceil$th column, where such difference can
be easily calculated since $\mathbf{P}_A$ and $\mathbf{V}_u$ are known. More importantly, this special
property indicates that for the proposed TS-based analog precoding, we only need to estimate the effective
channel matrix $\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A$ of size $N_r^{\mathrm{RF}} \times N_t^{\mathrm{RF}}$
through time-domain and/or frequency-domain training sequences [15], whose dimension is much lower than the
original dimension $N_r \times N_t$ of the exact channel matrix $\mathbf{H}$.

3) Tabu: In the conventional TS algorithm [12], the tabu is usually defined as the “move”, which can be
regarded as the direction from one solution to another one for the analog precoding problem. The “move”
can be denoted by $(a, b)$, where $a = 1, \cdots, N_t^{\mathrm{RF}}$ denotes that the $a$th column of the
original solution is different from that of the current solution, and $b \in \{-1, 1\}$ is the change of
the index of this particular column from the original solution to the current solution. In the example
above, the “move” (direction) from the precoder with column indices $(q_1, q_2)$ to the one with column
indices $(q_1 - 1, q_2)$ can be written as $(1, -1)$. Regarding the “move” as tabu can
save storage of the tabu list, since it only requires a tabu list $\mathbf{t}$ of size
$2N_t^{\mathrm{RF}} \times 1$, whose elements take values from $\{0, 1\}$ to indicate whether a move is
tabu or not (i.e., 1 is tabu, and 0 is unconstrained).
However, as shown in Fig. 3 (a), this method may lead to one solution being
searched twice, so that the cost function of the same neighborhood is computed again. To solve this
problem, we propose to take the exact solution as tabu. Specifically, let
$p = 1, 2, \cdots, 2^{B_t^{\mathrm{RF}} N_t^{\mathrm{RF}}}$ present the index of a candidate of the analog
precoder (solution) out of $\mathcal{F}$ with $2^{B_t^{\mathrm{RF}} N_t^{\mathrm{RF}}}$ possible candidates.
Particularly, $p$ can be calculated from the column indices $q_m$ ($1 \le m \le N_t^{\mathrm{RF}}$) of this
analog precoder as

$p = \sum_{m=1}^{N_t^{\mathrm{RF}}} (q_m - 1)\left(2^{B_t^{\mathrm{RF}}}\right)^{N_t^{\mathrm{RF}} - m} + 1$. (12)

For example, when $B_t^{\mathrm{RF}} = 3$ and $N_t^{\mathrm{RF}} = 2$, if an analog precoder has the column
indices $\{2, 7\}$, then the index of this analog precoder in $\mathcal{F}$ is
$p = (2 - 1) \cdot 8 + (7 - 1) \cdot 1 + 1 = 15$ according to (12). In this way, our method can efficiently
avoid one solution being searched twice, and therefore a wider searching range can be achieved as shown
in Fig. 3 (b). Note that the only cost of our method is the increased storage size of the tabu list
$\mathbf{t}$ from $2N_t^{\mathrm{RF}}$ to $2^{B_t^{\mathrm{RF}} N_t^{\mathrm{RF}}}$.
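The solution index in (12) is a base-$2^{B_t^{\mathrm{RF}}}$ positional encoding of the column indices. A minimal Python sketch (the function name `solution_index` is hypothetical) reproduces the worked example from the text:

```python
def solution_index(q, bits):
    """Map 1-based column indices q = (q_1, ..., q_N) to the 1-based index p in (12)."""
    n_rf = len(q)
    base = 2 ** bits
    return sum((q_m - 1) * base ** (n_rf - m)
               for m, q_m in enumerate(q, start=1)) + 1

p = solution_index((2, 7), bits=3)   # example from the text: p = 15
```

The smallest and largest index tuples map to $p = 1$ and $p = 2^{B_t^{\mathrm{RF}} N_t^{\mathrm{RF}}}$, so a tabu list of that length covers every candidate.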

4) Stopping criterion: We define flag as a parameter to indicate how long (in terms of the number of
iterations) the global optimal solution has not been updated. That is, in the current iteration, if
a suboptimal solution is selected as the starting point for the next iteration, we set flag = flag + 1;
otherwise, if the global optimal solution is selected, we set flag = 0. Based on this mechanism,
TS-based analog precoding is terminated when either of the following two conditions is satisfied: i)
The total number of iterations reaches the pre-defined maximum number of iterations max_iter; ii) The
number of iterations for which the global optimal solution has not been updated reaches the pre-defined
maximum value max_len, i.e., flag = max_len. Note that we usually set max_len < max_iter. If
TS-based analog precoding has already found the optimal solution at the beginning, all the starting points
in the following iterations will be suboptimal, so we do not need to wait for max_iter iterations.
Therefore, the average searching complexity can be reduced further.

5) Tabu search algorithm: Let $\mathbf{G}^{(i)}$ denote the analog precoder achieving the maximum cost
function (9) that has been found until the $i$th iteration. TS-based analog precoding starts with the
initial solution $\mathbf{P}_A^{(0)}$. Note that in order to improve the performance of TS-based analog
precoding, we can select $M$ different initial solutions uniformly distributed in $\mathcal{F}$ to start
TS-based analog precoding $M$ times; then,
the best one out of the $M$ obtained solutions is declared as the final analog precoder. For each initial
solution, we set flag = 0. Besides, all the elements of the tabu list $\mathbf{t}$ are set to zero.

Considering the $i$th iteration, TS-based analog precoding executes as follows:

Step 1: Compute the cost function (9) of the $2N_t^{\mathrm{RF}}$ neighbors of $\mathbf{P}_A^{(i)}$ given
the effective channel matrix $\mathbf{C}_A^H\mathbf{H}\mathbf{P}_A^{(i)}$. Let

$\mathbf{V}_1' = \arg\max_{1 \le u \le 2N_t^{\mathrm{RF}}} \varphi(\mathbf{V}_u^{(i)}, \mathbf{C}_A)$. (13)

Calculate the index $p_1'$ of $\mathbf{V}_1'$ in $\mathcal{F}$ according to (12). Then, $\mathbf{V}_1'$
will be selected as the starting point for the next iteration when either of the following two conditions
is satisfied:

$\varphi(\mathbf{V}_1', \mathbf{C}_A) > \varphi(\mathbf{G}^{(i)}, \mathbf{C}_A)$, (14)

$\mathbf{t}(p_1') = 0$. (15)

If $\mathbf{V}_1'$ cannot be selected, we find the second best solution as

$\mathbf{V}_2' = \arg\max_{1 \le u \le 2N_t^{\mathrm{RF}},\ \mathbf{V}_u^{(i)} \ne \mathbf{V}_1'} \varphi(\mathbf{V}_u^{(i)}, \mathbf{C}_A)$. (16)

Then we decide whether $\mathbf{V}_2'$ can be selected by checking (14) and (15). This procedure is
continued until one solution $\mathbf{V}'$ is selected as the starting point for the next iteration. Note
that if there is no solution satisfying (14) or (15), all the corresponding elements of the tabu list
$\mathbf{t}$ will be set to zero, and the same procedure above will be repeated.

²It is worth pointing out that to fully achieve the spatial multiplexing gain, the column index $q_m$
should be different for different RF chains, i.e., $q_1 \ne q_2 \ne \cdots \ne q_{N_t^{\mathrm{RF}}}$. All
the possible precoder/combiner matrices that do not obey this constraint will be declared as “tabu” to
avoid being searched.

Step 2: After a solution has been selected as the starting point, i.e., $\mathbf{P}_A^{(i+1)} = \mathbf{V}'$,
we set

$\mathbf{t}(p') = 0,\ \mathbf{G}^{(i+1)} = \mathbf{P}_A^{(i+1)}$, if $\varphi(\mathbf{P}_A^{(i+1)}, \mathbf{C}_A) > \varphi(\mathbf{G}^{(i)}, \mathbf{C}_A)$,
$\mathbf{t}(p') = 1,\ \mathbf{G}^{(i+1)} = \mathbf{G}^{(i)}$, if $\varphi(\mathbf{P}_A^{(i+1)}, \mathbf{C}_A) \le \varphi(\mathbf{G}^{(i)}, \mathbf{C}_A)$. (17)

TS-based analog precoding will be terminated in Step 2 and output $\mathbf{G}^{(i+1)}$ as the final
solution if the stopping criterion is satisfied. Otherwise it will go back to Step 1 and repeat the
procedure above until it satisfies the stopping criterion.

It is worth pointing out that searching for the near-optimal analog combiner $\mathbf{C}_A$ after a certain
analog precoder $\mathbf{P}_A$ has been selected can also be solved by a similar procedure to that
described above, where the definitions such as the neighborhood should be changed accordingly to search
for the near-optimal analog combiner $\mathbf{C}_A$.
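Steps 1 and 2 together with the stopping criterion can be put into a compact Python sketch. It is a simplified illustration under several assumptions: solutions are tuples of 1-based column indices, the tabu list is stored as a set of exact solutions rather than an indexed array, the cost is an arbitrary callable (a toy quadratic here instead of (9)), and candidates with repeated column indices or equal to the current solution are skipped. The name `tabu_search` and the toy cost are hypothetical.

```python
def tabu_search(cost, q_init, bits, max_iter, max_len):
    """Tabu search over column-index tuples with exact solutions as tabu."""
    top = 2 ** bits

    def neighbors(q):
        out = []
        for col in range(len(q)):
            for step in (-1, 1):
                idx = min(max(q[col] + step, 1), top)
                cand = q[:col] + (idx,) + q[col + 1:]
                # Skip the clamped copy of q and matrices with repeated columns.
                if cand != q and len(set(cand)) == len(cand):
                    out.append(cand)
        return list(dict.fromkeys(out))

    tabu = set()
    current = q_init
    best, best_cost = q_init, cost(q_init)
    flag = 0                                    # iterations since the best was updated
    for _ in range(max_iter):
        ranked = sorted(neighbors(current), key=cost, reverse=True)
        if not ranked:
            break
        # Step 1: take the best neighbor that beats the global best or is not tabu.
        chosen = next((v for v in ranked
                       if cost(v) > best_cost or v not in tabu), None)
        if chosen is None:                      # every neighbor is tabu: release them
            tabu.difference_update(ranked)
            chosen = ranked[0]
        current = chosen
        # Step 2: update the global best and the tabu status of the chosen solution.
        if cost(current) > best_cost:
            best, best_cost = current, cost(current)
            tabu.discard(current)
            flag = 0
        else:
            tabu.add(current)
            flag += 1
        if flag >= max_len:                     # stopping criterion ii)
            break
    return best, best_cost

def toy_cost(q):
    # Toy objective with a single maximum at (5, 12); stands in for (9).
    return -((q[0] - 5) ** 2 + (q[1] - 12) ** 2)

best, best_cost = tabu_search(toy_cost, q_init=(1, 2), bits=4,
                              max_iter=100, max_len=20)
```

The restart method described in the text would simply call such a routine $M$ times from different initial tuples and keep the best result.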


D. Complexity analysis 

In this subsection, we provide the complexity comparison between the proposed Turbo-TS beamforming
and the conventional FS beamforming. It is worth pointing out that although the proposed Turbo-TS
beamforming requires some extra information exchange between the BS and the user ($K$
iterations) as discussed in Section III-B, the corresponding overhead is trivial compared with the
searching complexity, since $K$ is usually small (e.g., $K = 4$, as will be verified by simulation
results). Therefore,

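in this section we evaluate the complexity as the total number of solutions that need to be searched. It is
obvious that the searching complexity of FS beamforming $C_{\mathrm{FS}}$ is

$C_{\mathrm{FS}} = \frac{2^{B_t^{\mathrm{RF}}}!}{\left(2^{B_t^{\mathrm{RF}}} - N_t^{\mathrm{RF}}\right)!} \cdot \frac{2^{B_r^{\mathrm{RF}}}!}{\left(2^{B_r^{\mathrm{RF}}} - N_r^{\mathrm{RF}}\right)!}$. (18)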

By contrast, the searching complexity of the proposed Turbo-TS beamforming $C_{\mathrm{TS}}$ is

$C_{\mathrm{TS}} = \left(2N_t^{\mathrm{RF}} \cdot \text{max\_iter} + 2N_r^{\mathrm{RF}} \cdot \text{max\_iter}\right) MK$. (19)

Comparing (18) and (19), we can observe that the complexity of Turbo-TS beamforming is linear in
$N_t^{\mathrm{RF}}$ and $N_r^{\mathrm{RF}}$, and it is independent of $B_t^{\mathrm{RF}}$ and
$B_r^{\mathrm{RF}}$, which indicates that Turbo-TS beamforming
enjoys a much lower complexity than FS beamforming. Table I shows the comparison of the searching
complexity between Turbo-TS beamforming and FS beamforming when the numbers of RF chains at the
BS and the user are $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = 2$, where three cases are considered: 1) For
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 4$, we set max_iter = 500 and max_len = 100, and select $M = 1$
initial solution to initiate the TS-based precoding/combining; 2) For
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 5$, we set max_iter = 1000, max_len = 200,
and $M = 2$; 3) For $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, we set max_iter = 3000, max_len = 600, and
$M = 5$. Besides, for all these cases, we set the total number of iterations $K = 4$ for the Turbo-like
joint search scheme. From Table I, we can observe that the proposed Turbo-TS beamforming scheme has much
lower searching complexity than the conventional FS beamforming, e.g., when
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, the searching
complexity of Turbo-TS beamforming is only about 2.9% of that of FS beamforming.
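The entries of Table I can be reproduced with a short Python sketch. The Turbo-TS count follows (19) directly. The FS count is computed here as the number of ordered selections of distinct column indices on each side, which is an assumption inferred from the values in Table I and from the constraint that column indices must differ; the function names are hypothetical.

```python
from math import perm

def fs_complexity(bits_t, bits_r, n_rf_t, n_rf_r):
    """Number of precoder/combiner pairs with distinct column indices (assumed form)."""
    return perm(2 ** bits_t, n_rf_t) * perm(2 ** bits_r, n_rf_r)

def ts_complexity(n_rf_t, n_rf_r, max_iter, m, k):
    """Worst-case number of solutions visited by Turbo-TS, following (19)."""
    return (2 * n_rf_t * max_iter + 2 * n_rf_r * max_iter) * m * k

# (bits, max_iter, M) for the three cases of Table I, with K = 4 and two RF chains.
cases = [(4, 500, 1), (5, 1000, 2), (6, 3000, 5)]
results = []
for bits, max_iter, m in cases:
    fs = fs_complexity(bits, bits, 2, 2)
    ts = ts_complexity(2, 2, max_iter, m, k=4)
    results.append((bits, fs, ts, ts / fs))
```

For six quantization bits the ratio `ts / fs` evaluates to roughly 0.0295, which Table I lists as 2.9 %.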


IV. Simulation Results 

We evaluate the performance of the proposed Turbo-TS beamforming in terms of the achievable rate.
Here we also provide the performance of the recently proposed beam steering scheme with continuous
angles as the benchmark for comparison, since it can be regarded as the upper bound of the proposed
Turbo-TS beamforming with quantized AoAs/AoDs. The system parameters for simulation are described as
follows: The carrier frequency is set as 28 GHz; we generate the channel matrix according to the channel
model [13] described in Section II; the AoAs/AoDs are assumed to follow the uniform distribution within
$[0, \pi]$; the complex gain $\alpha_l$ of the $l$th path follows $\alpha_l \sim \mathcal{CN}(0, 1)$, and
the total number of scattering propagation paths is set as $L = 3$; both the transmit and receive antenna
arrays are ULAs with antenna spacing $d = \lambda/2$. Three cases of quantization bits per AoA/AoD, i.e.,
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 4$, $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 5$,
and $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, are evaluated; SNR is defined as $\rho/\sigma^2$.
Additionally, the parameters used for the
proposed TS-based precoding/combining are the same as those in Section III-D.
At first, we provide the achievable rate performance of Turbo-TS beamforming against different
parameters to explain why we choose the values listed in Section III-D. Fig. 4 shows an example when
$N_r \times N_t = 16 \times 64$, $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$,
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, and SNR = 0 dB. We can observe that
when max_iter = 3000 (Fig. 4 (a)), max_len = 600 (Fig. 4 (b)), $M = 5$ (Fig. 4 (c)), and $K = 4$ (Fig.
4 (d)), the proposed Turbo-TS beamforming can achieve more than 90% of the rate of FS beamforming,
which verifies the rationality of our selection.

Fig. 5 shows the achievable rate comparison between the conventional FS beamforming and the
proposed Turbo-TS beamforming for an $N_r \times N_t = 16 \times 64$ mmWave massive MIMO system with
$N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$. We can observe that Turbo-TS beamforming can approach the
achievable rate of FS beamforming without obvious performance loss. For example, when
$B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 4$ and SNR
= 0 dB, the rate achieved by Turbo-TS beamforming is 7 bit/s/Hz, which is quite close to the 7.2 bit/s/Hz
achieved by FS beamforming. When the number of quantization bits per AoA/AoD increases, both Turbo-TS
beamforming and FS beamforming can achieve better performance, close to that of the beam steering scheme
with continuous AoAs/AoDs. Meanwhile, Turbo-TS beamforming can still guarantee satisfactory
performance quite close to that of FS beamforming. Considering the considerably reduced searching
complexity of Turbo-TS beamforming, we can conclude that the proposed Turbo-TS beamforming achieves a much
better trade-off between performance and complexity.

Fig. 6 shows the achievable rate comparison for an $N_r \times N_t = 32 \times 128$ mmWave massive MIMO
system, where the number of RF chains is still set as $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$.
From Fig. 6, we can observe similar trends to those from Fig. 5. More importantly, comparing Fig. 5 and
Fig. 6, we can find that the performance of the proposed Turbo-TS beamforming can be improved by
increasing the number of low-cost antennas instead of increasing the number of expensive RF chains. For
example, when $N_r \times N_t = 16 \times 64$, $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 6$, and SNR = 0 dB,
Turbo-TS beamforming can achieve the rate of 10.1 bit/s/Hz, while when $N_r \times N_t = 32 \times 128$,
the achievable rate can be increased to 14 bit/s/Hz without increasing the number of RF chains.

V. Conclusions 

In this paper, we propose a Turbo-TS beamforming scheme, which consists of two key components: 1) a
Turbo-like joint search scheme relying on the iterative information exchange between the BS and the user;
2) a TS-based precoding/combining scheme utilizing the idea of local search to find the best
precoder/combiner in each iteration of the Turbo-like joint search with low complexity. Analysis has shown
that the complexity of the proposed scheme is linear in $N_t^{\mathrm{RF}}$ and $N_r^{\mathrm{RF}}$, and it
is independent of $B_t^{\mathrm{RF}}$ and $B_r^{\mathrm{RF}}$, which
considerably reduces the complexity compared with conventional schemes. Simulation results have verified
the near-optimal performance of the proposed Turbo-TS beamforming. Our further work will focus on
extending the proposed Turbo-TS beamforming to the multi-user scenario.
Fig. 1. Architecture of mmWave massive MIMO system with beamforming.
Fig. 2. Proposed Turbo-like joint search scheme.
Fig. 3. Illustration of how the solution tabu can avoid one solution being searched twice: (a) Conventional “move” tabu; (b)
Proposed solution tabu.


TABLE I

Complexity comparison

B_t^RF = B_r^RF    Conventional FS beamforming    Proposed Turbo-TS beamforming    Complexity ratio (TS/FS)
4                  57600                          16000                            27.8 %
5                  984064                         64000                            6.5 %
6                  16257024                       480000                           2.9 %





Fig. 4. Achievable rate of Turbo-TS beamforming against different parameters: (a) max_iter; (b) max_len; (c) M; (d) K.
Fig. 5. Achievable rate comparison for an $N_r \times N_t = 16 \times 64$ mmWave massive MIMO system with $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$. Vertical axis: achievable rate (bps/Hz). Curves: FS beamforming [11] and Turbo-TS beamforming for $B_t^{\mathrm{RF}} = B_r^{\mathrm{RF}} = 4, 5, 6$; beam steering [20].
Fig. 6. Achievable rate comparison for an $N_r \times N_t = 32 \times 128$ mmWave massive MIMO system with $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$.
















